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Abstract 

The  design  of  the  POSTGRES  data  model  is  described.^he  data  model 
Is  a  relational  model  that  has  been  extended  with  abstract  data  types,  data 
of  type  procedure,  and  attribute  and  procedure  inheritance.  These  roechan- 
ims  can  be  used  to  simulate  a  wide  variety  of  semantic  and  object-oriented 
data  modeling  constructs  including  aggregation  and  generalization,  complex 
objects  with  shared  subobjects,  and  attributes  that  reference  tuples  in  other 
relations. 

1.  Introduction 

/  This  paper  describes  the  data  model  for  POSTGRES,  a  next-generation 
extensible  database  management  system  being  developed  at  the  University 
of  California  !StR86].  The  data  model  is  based  on  the  idea  of  extending  the 
relational  model  developed  by  Codd  lCod70]  with  general  mechanisms  that 
can  he  used  to  simulate  a  variety  of  semanctic  data  modeling  constructs 
The  mechanisms  include:  1)  abstract  data  types  (ADT’s),  2)  data  of  type  pro¬ 
cedure,  and  3)  rules.  These  mechanisms  can  be  used  to  support  complex 
objects  or  to  implement  a  shared  object  hierarchy  for  an  object-oriented  pro¬ 
gramming  language  (Row861.  Most  of  these  ideas  appeared  elsewhere 
(Sto85.Sto86a,Sto86b,Stel. 
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We  have  discovered  that  aome  semantic  constructs  that  were  not 
directly  supported  can  be  easily  added  to  the  system.  Consequently,  we 
have  made  several  changes  to  the  data  model  and  the  syntax  of  the  query 
language  that  are  documented  here.  The  changes  include  providing  support 
for  primary  keys,  inheritance  of  data  and  procedures,  and  attributes  that 
reference  tuples  in  other  relations. 

The  major  contribution  of  this  paper  is  to  show  that  inheritance  can  be 
added  to  a  relational  data  model  with  only  a  modest  number  of  changes  to 
the  model  and  the  implementation  of  the  system.  The  conclusion  that  we 
draw  from  this  result  is  that  the  m^jor  concepts  provided  in  an  object- 
oriented  data  model  (e.g.,  inheritance,  union  type  attributes,  and  support  for 
shared  subobjects)  can  be  cleanly  and  efficiently  supported  in  an  extensible 
relational  database  management  system.  The  features  used  to  support  these 
mechanisms  are  abstract  data  types  and  attributes  of  type  procedure. 

The  remainder  of  the  paper  describes  the  POSTGRES  data  model  and  is 
organized  as  follows.  Section  2  presents  the  data  model.  Section  3  describes 
the  attribute  type  system.  Section  4  describes  how  the  query  language  can 
be  extended  with  user-defined  procedures.  Section  6  compares  the  model 
with  other  data  models  and  section  6  summarizes  the  paper. 

2.  Data  Model 

A  database  is  composed  of  a  collection  of  relations  that  contain  tuples 
which  represent  real-world  entities  (e.g.,  documents  and  people)  or  relation¬ 
ships  (e.g.,  authorship).  A  relation  has  attributes  of  fixed  types  that 
represent  properties  Of  the  entities  and  relationships  (e.g.,  the  title  of  a 
document)  and  a  primary  key.  Attribute  types  can  be  atomic  (e.g.,  integer, 
floating  point,  or  boolean)  or  structured  (e.g.,  array  or  procedure).  The  pri¬ 
mary  key  is  a  sequence  of  attributes  of  the  relation,  whose  values  when 
taken  together,  uniquely  identify  each  tuple. 

A  simple  university  database  will  be  used  to  illustrate  the  model.  The 
following  command  defines  a  relation  that  represents  people: 

create  PERSON  ( Name  =  char[25],  Birthdate  =  date. 

Height  *  int4,  Weight  =  int4,  StreetAddress  *  char(25), 

City  =  char(25].  State  *  char[2]) 

This  command  defines  a  relation  and  creates  a  structure  for  storing  the 
tuples. 

The  definition  of  a  relation  may  optionally  specify  a  primary  key  and 
other  relations  from  which  to  inherit  attributes.  A  primary  key  is  a 
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combination  of  attributes  whose  values  together  uniquely  identify  each 
tuple.  The  key  is  specified  with  a  key-clause  as  follows: 

create  PERSON  (.  J 
.  key  (Name) 

Tuples  must  have  a  value  for  all  key  attributes.  The  specification  of  a  key 
may  optionally  include  the  name  of  an  operator  that  is  to  be  used  when 
comparing  two  tuples.  For  example,  suppose  a  relation  had  a  key  whose 
type  was  a  user-defined  ADT.  If  an  attribute  of  type  box  was  part  of  the  pri¬ 
mary  key.  the  comparison  operator  must  be  specified  since  different  box 
operators  could  be  used  distinguish  the  entries  (e.g.,  area  equals  or  box 
equality).  The  following  example  shows  the  definition  of  a  relation  with  a 
key  attribute  of  type  box  that  uses  area  equals  (AE): 

create  PlCTURE(Title  *  char[25],  Item  *  box) 
key  diem  using  AE) 

Data  inheritance  is  specified  with  an  inherits-clause.  Suppose,  for 
example,  that  people  in  the  university  database  are  employees  and/or 


Figure  1:  Person,  employee,  and  student  relation  hierarchy. 


students  and  that  different  attributes  are  to  be  defined  for  each  category. 
The  relation  for  each  category  includes  the  PEOPLE  attributes  and  other 
attributes  that  are  specific  to  the  category.  These  relations  can  be  defined 
by  replicating  the  PEOPLE  attributes  In  each  relation  definition  or  by 
Inheriting  them  for  the  definition  of  PEOPLE.  Figure  1  shows  the  relations 
and  an  inheritance  hierarchy  that  could  be  used  to  share  the  definition  of 
the  attributes.  The  commends  that  define  the  relations  are  shown  in  figure 
1 

A  relation  inherits  all  attributes  from  its  parents)  unless  an  attribute 
is  overriden  in  the  definition.  For  example,  the  EMPLOYEE  relation 
defined  in  figure  2  inherits  the  PERSON  attributes  Name,  Birthdate, 
Height ,  Weight,  StreetAddreee,  City,  and  State.  Key  specifications  are  also 
inherited  so  Name  is  also  the  key  for  EMPLOYEE. 

Relations  may  inherit  attributes  from  more  than  one  parent.  For  exam¬ 
ple.  STUDEMP  inherits  attributes  from  STUDENT  and  EMPLOYEE.  An 
inheritance  conflict  occurs  when  the  same  attribute  name  is  inherited  from 


create  PERSON  (Name  «=  char[25],  Birthdate  *=  date, 

Height  =  int4.  Weight  *  int4,  StreetAddress  =  char[25]. 

City  *  chari25].  State  «  char[2J) 
key  (Name) 

create  EMPLOYEE  (Dept  »  char[25].  Status  =  int2,  Mgr  =  char[25], 
JobTitle  *  char(25],  Salary  *  money) 
inherits  (PERSON) 

create  STUDENT  (Sno  =  chart  12],  Status  *=  int2,  Level  =  char(20]) 
Inherits  (PERSON) 

create  STUDEMP  (Is WorkStudy  *  bool) 
inherits  (STUDENT.  EMPLOYEE) 

Figure  2:  Relation  definitions. 


>  >  J  t 


more  than  one  parent  (e.g.,  STUDEMP  inherits  Statu*  from  EMPLOYEE 
and  STUDENT).  If  the  inherited  attributes  have  the  same  type,  an  attri¬ 
bute  with  the  type  is  included  in  the  relation  that  is  being  defined.  Other¬ 
wise,  the  declaration  is  disallowed. 1 

The  POSTGRES  query  language  is  a  generalized  version  of  QUEL 
[HSW75],  called  POSTQUEL.  QUEL  was  extended  in  several  directions. 
First,  POSTQUEL  has  a  from -clause  to  define  tuple-variables  rather  than  a 
range  command.  Second,  arbitrary  relation- valued  expressioqs.ittay  appear 
any  place  that  a  relation  name  could  appear  in  QUEL.  Third,  transitive 'clo¬ 
sure  and  execute  commands  pCue84]  have  been  added  to  the  language. 
And  lastly,  POSTGRES  maintains  historical  data  so  POSTQUEL  allows 
queries  to  be  run  on  past  database  states  or  on  any  data  that  was  in  the 
database  at  any  time.  These  extensions  are  described  in  the  remainder  of 
this  section. 

The  from -clause  was  added  to  the  language  so  that  tuple-variable 
definitions  for  a  query  could  be  easily  determined  at  compile-time.  This 
capability  was  needed  because  POSTGRES  will,  at  the  user’s  request,  com¬ 
pile  queries  and  save  them  in  the  catalogs.  The  from -clause  is  illustrated 
in  the  following  query  that  lists  all  work-study  students  who  are  sopho¬ 
mores: 

retrieve  (SE.name) 
from  SE  in  STUDEMP 
where  SEJsWorkStudy 

and  SE.Status  =  "sophomore” 

The  from-clause  specifies  the  set  of  tuples  over  which  a  tuple-variable  will 
range.  In  this  example,  the  tuple-variable  SE  ranges  over  the  set  of  student 
employees.  r . 

A  default  tuple-variable  with  the  same  name  is  defined  for  each  rela¬ 
tion  referenced  in  the  target-list  or  where-clause  of  a  query.  For  example, 
the  query  above  could  have  been  written: 


1  Most  attribute  inheritance  models  have  a  conflict  resolution  rule  that  selects  cot  e t 
the  conflicting  attributes.  We  chose  to  disallow  inheritance  because  we  could  act  discover 
aa  example  where  it  made  sente,  except  when  the  types  were  identical  On  the  ether  head, 
procedure  inheritance  (discussed  below)  does  use  e  conflict  resolution  rule  because  many  ex¬ 
amples  exist  in  which  one  procedure  ia  pnfcted 
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retrieve  (STUDEMPjoame) 
where  STUDEMPisWorkStudy 

and  STUDEMPStatus  *  "sophomore" 

Notice  that  the  attribute  IiWorkStudy  is  a  boolean-valued  attribute  so  it 
does  not  require  an  explicit  value  test  (e.g.,  STUDEMPJ$WorkStudy  «• 
mtnuTX 

The  set  of  tuples  that  a  tuple-variable  may  range  over  can  be  a  named 
relation  or  a  relation-expression.  For  example,  suppose  the  user  wanted  to 
retrieve  all  people  in  the  database  who  live  in  Berkeley  regardless  of 
whether  they  were  students,  employees,  student  employees,  or  people.  This 
query  can  be  written  as  follows: 

retrieve  (Pjaame) 
from  P  in  PEOPLE* 
where  P.city  *  "Berkeley" 

The  operator  specifies  the  relation  formed  by  taking  the  union  of  the 
relations  PEOPLE,  EMPLOYEE,  STUDENT ,  EMPLOYEE ,  and  STU - 
DEMP.  Relation  expressions  may  include  other  set  operators:  union  ({J), 
intersection  (p|)t  and  difference  (—).  For  example,  the  following  query 
retrieves  the  names  of  people  who  are  students  or  employees  but  not  student 
employees: 

retrieve  (P-name) 

from  P  in  (STUDENT  U  EMPLOYEE) 

POSTQUEL  also  provides  set  comparison  operators  and  a  relation- 
constructor  that  can  be  used  to  specify  some  difficult  queries  more  easily 
than  in  a  conventional  query  language.  For  example,  suppose  that  students 
could  have  several  majors.  The  natural  representation  for  this  data  is  to 
define  a  separate  relation: 

create  MAJORS(Sname  *  charl25],  Mname  *  char{25]) 

where  Sname  is  the  student's  name  and  Mname  is  the  major.  With  this 
representation,  the  following  query  retrieves  the  names  of  students  with  the 
tame  majors  as  Smith: 

retrieve  (Ml .Sname) 
from  Ml  in  MAJORS 

where  {(x.Mname)  from  x  in  MAJORS  where  x-Sname  *  Ml. Sname} 
C  {(x.Mname)  from  x  in  MAJORS  where  x .Sname ="Smith"l 

The  expressions  enclosed  in  set  symbols  ("{...}")  are  relation-constructors. 
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The  general  form  of  a  relation-constructor  *  is 
{ {target-list )  from  from-clause  where-clause } 
which  specifies  the  same  relation  as  the  query 

retrieve  (target-list) 
from  from -clause 
where  where-clause 

Note  that  a  tuple-variable  defined  in  the  outer  query  (e  g.,  Ml  in  the  query 
above)  can  be  used  within  a  relation-constructor  but  that  a  tuple-variable 
defined  in  the  relation-constructor  cannot  be  used  in  the  outer  query. 
Redefinition  of  a  tuple-variable  in  a  relation  constructor  creates  a  distinct 
variable  as  in  a  block-structured  programming  language  (e.g.,  PASCAL). 
Relation-valued  expressions  (including  attributes  of  type  procedure 
described  in  the  next  section)  can  be  used  any  place  in  a  query  that  a  named 
relation  can  be  used. 

Database  updates  are  specified  with  conventional  update  commands  as 
shown  in  the  following  examples: 

/•  Add  a  new  employee  to  the  database.  */ 

append  to  EMPLOYEE(name  =  value,  age  =  value,  ...) 

I*  Change  state  codes  using  MAP(01dCode,  NewCode).  */ 

replace  P(State  =  MAPJ^ewCode) 

from  P  in  PERSON* 

where  Pistate  =  MAP.OldCode 

/*  Delete  students  born  before  today.  */ 
delete  STUDENT 

where  STUDENT.Birthdate  >  todayO 
Deferred  update  semantics  are  used  for  all  updates  commands. 

POSTQUEL  supports  the  transitive  closure  commands  developed  in 
QUEL*  [Kue84].  A  "*”  command  continues  to  execute  until  no  tuples  are 
retrieved  (e.g.,  retrieve*)  or  updated  (e.g.,  append*,  delete*,  or  replace*). 
For  example,  the  following  query  creates  a  relation  that  contains  all 

9  Relation  constructors  arc  really  affrefete  functions  We  have  designed  a  nachantim 
to  support  extensible  aggregate  functions,  but  have  not  yet  worked  out  tbe  query  language 
syntax  and  semantics. 
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employees  who  work  for  Smith: 

retrieve*  into  SUBORDINATES(E.Name,  E.Mgr) 
from  E  in  EMPLOYEE,  S  in  SUBORDINATES 
where  E.Name  =  "Smith”  or  E.Mgr  =  S.Name 

•  r 

This  command  continues  to  execute  the  retrieve-into  command  until  there 
ore  no  changes  made  to  the  SUBORDINATES  relation. 

Lastly,  POSTGRES  never  deletes  or  updates  data  in  a  relation  so  that 
queries  can  be  executed  on  historical  data.  For  example,  the  following 
query  looks  for  students  who  lived  in  Berkeley  on  August  1, 1980: 

retrieve  (S.Name) 

from  S  in  STUDENTTAugust  1, 1980”) 
where  S.City  =  "Berkeley” 

The  date  specified  in  the  brackets  following  the  relation  name  specifies  the 
relation  at  the  designated  time.  The  date  can  be  specified  in  many  different 
formats  and  optionally  may  include  a  time  of  day. 

Queries  can  also  be  executed  on  all  data  that  is  currently  in  the  rela¬ 
tion  or  was  in  it  at  some  time  in  the  past  (i.e.,  all  data).  The  following 
query  retrieves  all  students  who  ever  lived  in  Berkeley: 

retrieve  (S.Name) 
from  S  in  STUDENTS 
where  S.City  =  "Berkeley" 

The  notation  "[]”  can  be  appended  to  any  relation  name.  The  DBMS  saves 
all  data  unless  requested  not  to  by  the  user.  Data  can  be  purged  explicitly 
(e.g.,  "purge  data  before  a  given  date”)  or  implicitly  (e.g.,  "purge  data  over  1 
year  old”). 

The  query  above  only  examines  students  who  were  not  student  employ¬ 
ees.  To  search  the  set  of  all  students,  the  from-dause  would  be 

..from  S  in  STUDENT*!]... 

Queries  can  also  be  specified  on  data  that  was  in  the  relation  during  a 
given  time  period. 

The  time  period  is  specified  by  giving  a  start  and  end  time  as  shown  in 
the  following  query  that  retrieves  students  who  lived  in  Berkeley  at  any 
time  in  August  1980: 


| 
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retrieve  ©.Name) 

from  8  in  STUDENTTAugust  1, 1980","August  81, 198 CT] 
where  S.City  *  "Berkeley** 

Shorthand  notations  are  supported  for  all  tuples  in  a  relation  up  to  some 
date  (e.f.,  STUDENT*[,nAugutt  1, 1980”])  or  from  some  date  to  the  present 
(e.g.,  STUDENT* r August  J.  1980” J). 

Finally,  every  relation  has  implicitly  defined  attributes  that  represent 
the  time  a  tuple  was  appended  to  the  relation  {TMIN)  and  the  time  it  was 
deleted  or  replaced  iTMAX).  These  attributes  and  operators  on  the  date 
data  type  can  be  used  to  find  the  names  of  students  who  lived  in  Berkeley 
less  than  30  days: 

retrieve  ©.Name) 
from  S  in  STUDENT*!) 
w  here  S.City  =  "Berkeley" 
and  date(S.TMAX  -  S.TMIN)  <  "30  days" 

The  predefined  date  function  takes  a  value  of  the  type  used  to  represent 
time  internally  in  the  system  and  converts  it  to  a  date  data  type. 

POSTGRES  also  supports  versions  of  relations.  A  version  can  be 
created  from  a  relation  or  a  snapshot.  A  version  is  created  by  specifying  the 
base  relation  as  shown  in  the  command 

create  version  MYPEOPLE  from  PEOPLE 

which  creates  a  version,  named  MYPEOPLE,  that  is  derived  from  the  PEO¬ 
PLE  relation.  Data  can  be  retrieved  from  and  updated  in  a  version  just  like 
a  relation.  Updates  to  a  version  do  not  modify  the  underlying  relation  but 
updates  to  the  underlying  relation  are  visible  through  the  version  unless  the 
value  has  been  modified  in  the  version.  For  example,  if  George’s  birthdate 
is  changed  in  MYPEOPLE,  a  replace  command  that  changes  his  birthdate 
in  PEOPLE  will  not  be  visible  in  MYPEOPLE. 

If  the  user  does  not  want  updates  to  the  base  relation  to  propagate  to 
the  version,  he  can  create  a  version  of  a  snapshot.  For  example,  the  follow¬ 
ing  command  creates  a  version  of  PEOPLE  that  is  derived  from  the  current 
contents  of  the  relation: 

create  version  YOURPEOPLE  from  PEOPLE[now(» 

The  procedure  now  returns  the  current  date  and  time. 

A  merge  command  is  provided  that  will  merge  the  changes  made  in  a 
version  back  into  the  underlying  relation.  An  example  of  this  command  is 
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merge  YOURPEOPLE  into  PEOPLE 

that  will  merge  the  changes  made  to  YOURPEOPLE  back  into  PEOPLE. 
The  merge  command  uses  a  semi-automatic  procedure  to  resolve  updates  to 
the  underlying  relation  and  the  version  that  conflict  lGae84). 

f 

This  section  described  most  of  the  data  definition  and  data  manipula¬ 
tion  commands  in  POSTQUEL.  The  commands  that  were  not  described  are 
the  commands  for  defining  rules,  utility  commands  that  only  affect  the  per¬ 
formance  of  the  system  (e.g.,  define  index  and  modify),  and  other  miscel¬ 
laneous  utility  commands  (e.g.,  destroy  and  copy).  The  next  section 
describes  the  type  system  for  relation  attributes. 

3.  Data  Types 

POSTGRES  provides  a  collection  of  atomic  and  structured  types.  The 
predefined  atomic  types  include:  int2,  int4,  fioat4,  floats,  bool,  char,  and 
date.  The  standard  arithmetic  and  comparison  operators  are  provided  for 
the  numeric  and  date  data  types  and  the  standard  string  and  comparison 
operators  for  character  arrays.  Users  can  extend  the  system  by  adding  new 
atomic  types  using  an  abstract  data  type  (ADT)  definition  facility. 

All  atomic  data  types  are  defined  to  the  system  as  ADT’s.  An  ADT  is 
defined  by  specifying  the  type  name,  the  length  of  the  internal  representa¬ 
tion  in  bytes,  procedures  for  converting  from  an  external  to  internal 
representation  for  a  value  and  from  an  internal  to  external  representation, 
and  a  default  value.  The  command 

define  type  int4  is 

(InternalLength  =  4,  InputProc  =  CharToInt4, 

OutputProc  =  lnt4ToChar,  Default  *=  "(T) 

defines  the  type  int4  which  is  predefined  in  the  system.  CharTalnt4  and 
IrU4ToChar  are  procedures  that  are  coded  in  a  conventional  programming 
langauge  (e.g.,  C)  and  defined  to  the  system  using  the  commands  described 
in  section  4. 

Operators  on  ADTs  are  defined  by  specifying  the  the  number  and  type 
of  operands,  the  return  type,  the  precedence  and  associativity  of  the  opera¬ 
tor,  and  the  procedure  that  implements  it.  For  example,  the  command 

define  operator  "+”(int4,  int4)  returns  int4  is 

(Proc  =  Plus,  Precedence  *  6,  Associativity  *  Taft") 

defines  the  plus  operator.  Precedence  is  specified  by  a  number.  Larger 
numbers  imply  higher  precedence.  The  predefined  operators  have  the  pre¬ 
cedences  shown  in  figure  3.  These  precedences  can  be  changed  by  changing 
the  operator  definitions.  Associativity  is  either  left  or  right  depending  on 
the  semantics  desired.  This  example  defined  an  operator  denoted  by  a  sym¬ 
bol  (i.e.,  *’+”).  Operators  can  also  be  denoted  by  identifiers  as  shown  below. 
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Figure  3:  Precedence  of  predefined  operators. 


Another  example  of  an  AJDT  definition  is  the  following  command  that 
defines  an  ADT  that  represents  boxes: 

define  type  box  is 

(InternalLength  *=  16,  InputProc  *  CharToBox, 

OutputProc  =  BoxToChar,  Default  *  "") 

The  external  representation  of  a  box  is  a  character  string  that  contains  two 
points  that  represent  the  upper-left  and  lower-right  corners  of  the  box. 
With  this  representation,  the  constant 

"20,50:10, 70n 

describes  a  box  whose  upper-left  corner  is  at  (20,  60)  and  lower-right  corner 
is  at  (10,  70).  CharToBox  takes  a  character  string  like  this  one  and  returns 
a  16  byte  representation  of  a  box  (e.g.,  4  bytes  per  x-  or  y-coordinate  value). 
BoxToChar  is  the  inverse  of  CharToBox 

Comparison  operators  can  be  defined  on  ADT's  that  can  be  used  in 
access  methods  or  optimised  in  queries.  For  example,  the  definition 
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define  operator  AECbox,  box)  returns  bool  Is 

(Proc  *  BoxAE,  Precedence  *  3,  Associativity  *  "left". 

Sort  =  BoxArea,  Hashes,  Restrict  *  AERSelectivity, 

Join  *  AEJSelectivity,  Negator  *  BoxAreaNE) 

defines  an  operator  "area  equals"  on  boxes.  In  addition  to  the  semantic 
information  about  the  operator  itself,  this  specification  includes  information 
needed  to  build  indexes  and  to  optimize  queries  using  the  operator.  For 
example,  suppose  the  PICTURE  relation  was  defined  by 

create  PICTURE  (Title  *  charD,  Item  *  box) 
and  the  query 

retrieve  (PICTURE.aB) 

where  PICTURE.Item  AE  "60,100:100,50" 

was  being  processed.  The  Sort  attribute  of  the  operator  specifies  the  pro¬ 
cedure  to  be  used  if  a  merge-sort  join  strategy  was  selected  to  implement  the 
query.  It  also  specifies  the  procedure  to  use  when  building  an  ordered  index 
(e.g.,  B-Tree)  on  a  box  attribute  in  a  relation.  The  Hashes  attribute  indi¬ 
cates  that  this  operator  can  be  used  to  build  a  hash  index  on  a  box  attribute 
in  a  relation.  Note  that  either  type  of  index  can  be  used  to  optimize  the 
query  above.  The  Restrict  and  Join  attributes  specify  the  procedure  that  is 
to  be  called  by  the  query  optimizer  to  compute  the  restrict  and  join  selectivi* 
ties,  respectively,  of  a  clause  involving  the  operator.  These  selectivity  attri¬ 
butes  identify  procedures  to  be  called  that  will  return  a  floating  point  value 
between  0.0  and  1.0  that  specify  the  selectivity  given  the  operator.  Lastly, 
the  Negator  attribute  specifies  the  procedure  that  is  to  be  used  to  compare 
two  values  when  a  query  predicate  requires  the  operator  to  be  negated  as  in 

retrieve  (PICTURE.all) 

where  not  (PICTUREJtera  AE  "50,100:100,50") 

The  define  operator  command  also  may  specify  a  procedure  that  can  be 
used  if  the  query  predicate  includes  an  operator  that  is  not  commutative. 
For  example,  the  commutator  procedure  for  "area  less  than"  (ALT)  is  the 
procedure  that  implements  "area  greater  than  or  equal"  (AGE).  More 
details  on  the  use  of  these  attributes  is  given  elsewhere  [Sto86a). 

Type-constructors  are  provided  to  define  structured  types  (e.g.,  arrays 
and  procedures)  that  can  be  used  to  represent  complex  data.  An  array  type- 
constructor  can  be  used  to  define  a  variable-  or  fixed-size  array.  A  fixed-size 
array  is  declared  by  specifying  the  element  type  and  upper  bound  of  the 
array  as  illustrated  by 
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create  PERSON(Name  *  eharl25D 

which  defines  an  array  of  twenty-five  characters.  The  elements  of  the  array 
are  referenced  by  indexing  the  attribute  by  an  integer  between  1  and  25 
(eg.,  "PERSON NameliT  references  the  fourth  character  in  the  person's 
name). 

A  variable-size  array  is  specified  by  omitting  the  upper  bound  in  the 
type  constructor.  For  example,  a  variable-sized  array  of  characters  is 
specified  by  "charil ."  Variable-size  arrays  are  referenced  by  indexing  the 
attribute  by  an  integer  between  1  and  the  current  upper  bound  of  the  array. 
The  predefined  function  size  returns  the  current  upper  bound.  POSTGRES 
does  not  impose  a  limit  on  the  size  of  a  variable-size  array.  Built-in  func¬ 
tions  are  provided  to  append  arrays  and  to  fetch  array  slices.  For  example, 
two  character  arrays  can  be  appended  using  the  concatenate  operator  C* t") 
and  an  array  slice  containing  characters  2  through  15  in  an  attribute  named 
x  can  be  fetched  by  the  expression  "xUilS)." 

The  second  type-constructor  allows  values  of  type  procedure  to  be  stored 
in  an  attribute.  Procedure  values  are  represented  by  a  sequence  of  POST¬ 
QUEL  commands.  The  value  of  an  attribute  of  type  procedure  is  a  relation 
because  that  is  what  a  retrieve  command  returns.  Moreover,  the  value 
may  include  tuples  from  different  relations  (i.e.,  of  different  types)  because  a 
procedure  composed  of  two  retrieve  commands  returns  the  union  of  both 
commands.  We  call  a  relation  with  different  tuple  types  a  multirelation. 
The  POSTGRES  programming  language  interface  provides  a  cursor-like 
mechanism,  called  a  portal ,  to  fetch  values  from  multirelations  (StR6€l. 
However,  they  are  not  stored  by  the  system  (i.e.,  only  relations  are  stored). 

The  system  provides  two  kinds  of  procedure  type-constructors:  variable 
and  parameterized.  A  variable  procedure-type  allows  a  different  POST¬ 
QUEL  procedure  to  be  stored  in  each  tuple  while  parameterized  procedure- 
types  store  the  same  procedure  in  each  tuple  but  with  different  parameters. 
We  will  illustrate  the  use  of  a  variable  procedure-type  by  showing  another 
way  to  represent  student  majors.  Suppose  a  DEPARTMENT  relation  was 
defined  with  the  following  command: 

create  DEPARTMENT(Name  *  char!251,  Chair  *  chart 25],  J 

A  student's  majorfs)  can  then  be  represented  by  a  procedure  in  the  STU¬ 
DENT  relation  that  retrieves  the  appropriate  DEPARTMENT  tuple(s).  The 
Majort  attribute  would  be  declared  as  follows: 

create  STUDENTS..,  Majors  s  postquel,  ...) 

Data  type  pottquel  represents  a  procedure-type.  The  value  in  Mqjort  will  be 


13 


•  query  that  fetches  the  department  relation  tuples  that  represent  the 
student’s  minors.  The  following  command  appends  a  student  to  the  data¬ 
base  who  has  a  double  major  in  mathematics  and  computer  science: 

append  STUDENT(  Name  «  -Smith",  _ 

Majors  *  "retrieve  (D.a!l) 

from  D  in  DEPARTMENT 
where  D.Name  *  "Mathematics" 

or  DName  *  "Computer  Science"") 

A  query  that  references  the  Major*  attribute  returns  the  string  that 
contains  the  POSTQUEL  commands.  However,  two  notations  are  provided 
that  will  execute  the  query  and  return  the  result  rather  than  the  definition. 
First,  nested-dot  notation  implicitly  executes  the  query  as  illustrated  by 

retrieve  (S.Name,  S  .Majors  .Name) 
from  S  fin  STUDENT 

which  prints  a  list  of  names  and  majors  of  students.  The  result  of  the  query 
in  Major*  is  implicitly  joined  with  the  tuple  specified  by  the  rest  of  the 
target-list.  In  other  words,  if  a  student  has  two  majors,  this  query  will 
return  two  tuples  with  the  Nome  attribute  repeated.  The  implicit  join  is 
performed  to  guarantee  that  a  relation  is  returned. 

The  second  way  to  execute  the  query  is  to  use  the  execute  command. 
For  example,  the  query 

execute  (S.Majors) 
from  S  in  STUDENT 
where  SJName  *  "Smith" 

returns  a  relation  that  contains  DEPARTMENT  tuples  for  all  of  Smith’s 
majors. 

Parameterized  procedure-types  are  used  when  the  query  to  be  stored  in 
an  attribute  is  nearly  the  same  for  every  tuple.  The  query  parameters  can 
be  taken  from  other  attributes  in  the  tuple  or  they  may  be  explicitly 
specified.  For  example,  suppose  an  attribute  in  STUDENT  was  to  represent 
the  student's  current  class  list.  Given  the  following  definition  for  enroll¬ 
ments: 

create  ENROLLMENTCStudent  *  charI25],  Class  *  char(25]) 

Bill's  class  list  can  be  retrieved  by  the  query 
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retrieve  (ClassName  *  E.Claas) 
from  E  in  ENROLLMENT 
where  E  Student  *  "Bill” 

This  query  will  be  the  seme  for  every  student  except  for  the  constant  that 
specifies  the  student’s  name. 

A  parameterized  procedure-type  could  be  defined  to  represent  this  query 
as  fellows: 

define  type  classes  is 

retrieve  (ClassName  *  E  .Class) 
from  E  in  ENROLLMENT 
where  EStudent  »  $<Name 

end 

The  dollar-sign  symbol  Cl”)  refers  to  the  tuple  in  which  the  query  is  stored 
(i.e.,  the  current  tuple).  The  parameter  for  each  instance  of  this  type  (i.e.,  a 
query)  is  the  Name  attribute  in  the  tuple  in  which  the  instance  is  stored. 
This  type  is  then  used  in  the  create  command  as  follows 

create  STUDENT(Name  =  charI25], ...»  Cl&ssList  *=  classes) 

to  define  an  attribute  that  represents  the  student’s  current  class  list.  This 
attribute  can  be  used  in  a  query  to  return  a  list  of  students  and  the  classes 
they  are  taking: 

retrieve  (S.Name,  S.ClassList.ClassName) 

Notice  that  for  a  particular  STUDENT  tuple,  the  expression  " S.Name ”  in 
the  query  refers  to  the  name  of  that  student.  The  symbol  can  be 
thought  of  as  a  tuple-variable  bound  to  the  current  tuple. 

Parameterized  procedure-types  are  extremely  useful  types,  but  some¬ 
times  it  is  inconvenient  to  store  the  parameters  explicitly  as  attributes  in 
the  relation.  Consequently,  a  notation  is  provided  that  allows  the  parame¬ 
ters  to  be  stored  in  the  procedure-type  value.  This  mechanism  can  be  used 
to  simulate  attribute  types  that  reference  tuples  in  other  relations.  For 
example,  suppose  you  wanted  a  type  that  referenced  a  tuple  in  the 
DEPARTMENT  relation  defined  above.  This  type  can  be  defined  as  follows: 

define  type  DEPARTMENT(int4)  Is 
retrieve  (DEPARTMENT.aD) 
where  DEPARTMENTS  *  $1 

end 

The  relation  jiame  can  be  used  for  the  type  name  because  relations,  types, 
and  procedures  have  separate  name  spaces.  The  query  in  type 
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DEPARTMENT  will  retrieve  a  specific  department  tuple  given  a  unique 
object  identifier  {oid)  of  the  tuple.  Each  relation  has  an  implicitly  defined 
attribute  named  oid  that  contains  the  tuple’s  unique  identifier.  The  aid 
attribute  can  be  accessed  but  not  updated  by  user  queries.  Oid  values  are 
created  and  maintained  by  the  POSTGEES  storage  system  [Sto87].  The  for¬ 
mal  argument  to  this  procedure-type  is  the  type  of  an  object  identifier.  The 
parameter  is  referenced  inside  the  definition  fay  m$nm  where  n  is  the  parame¬ 
ter  number. 

An  actual  argument  is  supplied  when  a  value  is  assigned  to  an  attri¬ 
bute  of  type  DEPARTMENT.  For  example,  a  COURSE  relation  can  be 
defined  that  represents  information  about  a  specific  course  including  the 
department  that  offers  it  The  create  command  is: 

create  COURSE(Title  *  charl253.  Dept  *  DEPARTMENT, ...) 

The  attribute  Dept  represents  the  department  that  offers  the  course.  The 
following  query  adds  a  course  to  the  database: 

append  COURSE(Title  =  Introductory  Programming", 

Dept  =  DEPARTMENT(D.oid)) 
from  D  in  DEPARTMENT 
where  D.Name  *  "computer  science" 

The  procedure  DEPARTMENT  called  in  the  target-list  is  implicitly  defined 
by  the  "define  type"  command.  It  constructs  a  value  of  the  specified  type 
given  actual  arguments  that  are  type  compatible  with  the  formal  argu¬ 
ments,  in  this  case  an  int4. 

Parameterized  procedure-types  that  represent  references  to  tuples  in  a 
specific  relation  are  so  commonly  used  that  we  plan  to  provide  automatic 
support  for  them.  First,  every  relation  created  will  have  a  type  that 
represents  a  reference  to  a  tuple  implicitly  defined  similar  to  the  DEPART - 
MENT  type  above.  And  second,  it  will  be  possible  to  assign  a  tuple-variable 
directly  to  a  tuple  reference  attribute.  In  other  words,  the  assignment  to 
the  attribute  Dept  that  is  written  in  the  query  above  as 

...  Dept  *  DEPARTMENT(D.oid)  ... 
can  be  written  as 

~  Dept  «  D ... 

Parameterized  procedure-types  can  also  be  used  to  implement  a  type 
that  references  a  tuple  in  an  arbitrary  relation.  The  type  definition  is: 
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define  type  tuplefcharQ,  int4)  le 
retrieve  (fl-aH) 
where  fl.oid  *  $2 

end 

/ 

The  first  argument  is  the  name  of  the  relation  and  the  second  argument  is 
the  ouf  of  the  desired  tuple  in  the  relation.  In  effect,  this  type  defines  a 
reference  to  ah  arbitrary  tuple  in  the  database. 

The  procedure-type  tuple  can  be  used  to  create  a  relation  that 
represents  people  who  help  with  fund  raising: 

create  VOLUNTEER(Person  *=  tuple,  TimeAvailable  =  integer, ...) 

Because  volunteers  may  be  students,  employees,  or  people  who  are  neither 
students  nor  employees,  the  attribute  Perton  must  contain  a  reference  to  a 
tuple  in  an  arbitrary  relation.  The  following  command  appends  all  students 
to  VOLUNTEER : 

append  VOLUNTEERCPerson  =  tuple(relation(S),  S.oid)) 
from  S  in  STUDENT* 

The  predefined  function  relation  returns  the  name  of  the  relation  to  which 
the  tuple-variable  S  is  bound. 

The  type  tuple  will  also  be  special-cased  to  make  it  more  convenient. 
Tuple  will  be  a  predefined  type  and  it  will  be  possible  to  assign  tuple- 
variables  directly  to  attributes  of  the  type.  Consequently,  the  assignment  to 
Perton  written  above  as 

...  Person  as  relation(S),  S.oid  ... 
can  be  written 

...  Person  *=  S  ... 

We  expect  that  as  we  get  more  experience  with  POSTGRES  applications 
that  more  types  may  be  special-cased. 

4.  User-Defined  Procedures 

This  section  describes  language  constructs  for  adding  user-defined  pro¬ 
cedures  to  POSTQUEL.  User-defined  procedures  are  written  in  a  conven¬ 
tional  programming  language  and  are  used  to  implement  ADT  operators  or 
to  move  a  computation  from  a  front-end  application  process  to  the  back-end 
DBMS  process. 

Moving  a  computation  to  the  back-jnd  opens  up  possibilities  for  the 
DBMS  to  precompute  a  query  that  includes  the  computation.  For  example, 
suppose  that  a  front-end  application  needed  to  fetch  the  definition  of  a  form 
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from  ft  database  and  to  construct  ft  main-memory  data  structure  that  the 
run-time  forms  system  used  to  display  the  form  on  the  terminal  screen  for 
data  entry  or  display.  A  conventional  relation  database  design  would  store 
the  form  components  (e.g.,  titles  and  field  definitions  for  different  types  of 
fields  such  as  scalar  fields,  table  fields,  and  graphics  fields)  in  many 
different  relations.  An  example  database  design  is  shown  in  figure  4.  The 
query  that  fetches  the  form  from  the  database  must  execute  at  least  one 
query  per  table  and  sort  through  the  return  tuples  to  construct  the  main- 
memory  data  structure.  This  operation  must  take  less  than  two  seconds  for 
an  interactive  application.  Conventional  relational  DBMS’s  cannot  satisfy 
this  time  constraint 

Our  approach  to  solving  this  problem  is  to  move  the  computation  that 
constructs  the  main-memory  data  structure  to  the  database  process.  Sup¬ 
pose  the  procedure  CorutructForm  built  the  data  structure  given  the  name 
of  a  form.  Using  the  parameterized  procedure-type  mechanism  defined 
above  an  attribute  can  be  added  to  the  FORM  relation  that  stores  the  form 
representation  computed  by  this  procedure.  The  commands 


create  FORM(FormName,  ~) 

create  FIELDS(FormName,  FieldName,  Origin,  Height,  Width, 
FieldKind,  ...) 

create  SCALARFIELDCFormName,  FieldName,  DataType, 
DisplayFormat,  -.) 

create  TABLEFIELDiFormName,  FieldName,  NumberOfRows,  ...) 
create  TABLECOLUMNS(FormName,  FieldName,  ColumnName, 
Height,  Width,  FieldKind,  _.) 


Figure  4:  Database  design  for  storing  form  definitions. 
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define  type  formrep  Is 

retrieve  (rep  *  ConstructF orm(I.FormName)) 

end 

eddattribute  (FormName,  FormDataStructure  =  formrep) 

*  to  FORM 

define  the  procedure  type  and  add  an  attribute  to  the  FORM  relation. 

The  advantage  of  this  representation  ia  that  POSTGRES  can  precom¬ 
pute  the  answer  to  a  procedure-type  attribute  and  store  it  in  the  tuple.  By 
precomputing  the  main-memory  data  structure  representation,  the  form  can 
be  fetched  from  the  database  by  a  single-tuple  retrieve: 

retrieve  (a  sc  FORM. FormDataStructure) 
where  FORM  .F ormN ame  *  "fbo" 

The  real-time  constraint  to  fetch  and  display  a  form  can  be  easily  met  if  all 
the  program  must  do  is  a  single-tuple  retrieve  to  fetch  the  data  structure 
and  call  the  library  procedure  to  display  it.  This  example  illustrates  the 
advantage  of  moving  a  computation  (i.e.,  constructing  a  main-memory  data 
structure)  from  the  application  process  to  the  DBMS  process. 

A  procedure  is  defined  to  the  system  by  specifying  the  names  and  types 
of  the  arguments,  the  return  type,  the  language  it  is  written  in,  and  where 
the  source  and  object  code  is  stored.  For  example,  the  definition 

define  procedure  AgelnYears(date)  returns  int4  is 
(language  *  "C",  filename  *  "AgelnYears") 

defines  a  procedure  AgeJnYeart  that  takes  a  date  value  and  returns  the  age 
of  the  person.  The  argument  and  return  types  are  specified  using 
POSTGRES  types.  When  the  procedure  is  called,  it  is  passed  the  arguments 
in  the  POSTGRES  internal  representation  for  the  type.  We  plan  to  allow 
procedures  to  be  written  in  several  different  languages  including  C  and  Lisp 
which  are  the  two  languages  being  used  to  implement  the  system. 

POSTGRES  stores  the  information  about  a  procedure  in  the  system 
catalogs  and  dynamically  loads  the  object  code  when  it  is  called  in  a  query. 
The  following  query  uses  the  AgeJnYeart  procedure  to  retrieve  the  names 
and  ages  of  all  people  in  the  example  database: 

retrieve  (P.N ame.  Age  *  AgelnYearsfP.Birthdate)) 
from  P  in  PERSON* 

User-defined  procedures  ean  also  take  tuple-variable  arguments.  For 
example,  the  following  command  defines  a  procedure,  called  Comp,  that 
takes  an  EMPLOYEE  tuple  and  computes  the  person’s  compensation 
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according  to  some  formula  that  involves  several  attributes  in  the  tuple  (e.g., 
the  employee’s  status,  job  title,  and  salary): 

define  procedure  Comp(EMPLOYEE)  returns  int4 
is  (language  *=  *C",  filename  *  "Compl”) 

Recall  that  a  parameterized  procedure-type  is  defined  for  each  relation 
automatically  so  the  type  EMPLOYEE  represents  a  reference  to  a  tuple  in 
the  EMPLOYEE  relation.  This  procedure  is  called  in  the  following  query: 

retrieve  (E-Name,  Compensation  “  CompCE)) 
from  E  in  EMPLOYEE 

The  C  function  that  implements  this  procedure  is  passed  a  data  structure 
that  contains  the  names,  types,  and  values  of  the  attributes  in  the  tuple. 

User-defined  procedures  can  be  passed  tuples  in  other  relations  that 
inherit  the  attributes  in  the  relation  declared  as  the  argument  to  the  pro¬ 
cedure.  For  example,  the  Comp  procedure  defined  for  the  EMPLOYEE  rela¬ 
tion  can  be  passed  a  STUDEMP  tuple  as  in 

retrieve  (SE.Name,  Compensation  =  Comp(SE)) 
from  SE  in  STUDEMP 

because  STUDEMP  inherits  data  attributes  from  EMPLOYEE . 

The  arguments  to  procedures  that  take  relation  tuples  as  arguments 
must  be  passed  in  a  self-describing  data  structure  because  the  procedure  can 
be  passed  tuples  from  different  relations.  Attributes  inherited  from  other 
relations  may  be  in  different  positions  in  the  relations.  Moreover,  the 
values  passed  for  the  same  attribute  name  may  be  different  types  (e.g.,  the 
definition  of  an  inherited  attribute  may  be  overriden  with  a  different  type). 
The  self-describing  data  structure  is  a  list  of  arguments,  one  per  attribute  in 
the  tuple  to  be  passed,  with  the  following  structure 

(AttributeName,  AttributeType,  AttributeValue) 

The  procedure  code  will  have  to  search  the  list  to  find  the  desired  attribute. 
A  library  of  routines  is  provided  that  will  hide  this  structure  from  the  pro¬ 
grammer.  The  library  will  include  routines  to  get  the  type  and  value  of  an 
attribute  given  the  name  of  the  attribute.  For  example,  the  following  code 
letches  the  value  of  the  Birthdate  attribute: 

GetValuerBirthdate") 

The  problem  of  variable  argument  lists  arises  in  all  object-oriented  program¬ 
ming  languages  and  similar  solutions  are  used. 


The  model  for  procedure  inheritance  is  nearly  identical  to  method 
inheritance  in  otgect-oriented  programming  langauges  [StB86].  Procedure 
inheritance  uses  the  data  inheritance  hierarchy  and  similar  inheritance 
rules  except  that  a  rule  is  provided  to  select  a  procedure  when  an  inheri¬ 
tance  conflict  arises.  For  example,  suppose  that  a  Comp  procedure  was 
defined  lor  STUDENT  as  well  as  for  EMPLOYEE.  The  definition  of  the 
second  procedure  might  be: 

define  procedure  Comp(STUDENT)  returns  int4 
Is  (language  *=  "C",  filename  *  "Comp2”) 

A  conflict  arises  when  the  query  on  STUDEMP  above  is  executed  because 
the  system  does  not  know  which  Comp  procedure  to  call  (i.e.,  the  one  for 
EMPLOYEE  or  the  one  for  STUDENT).  The  procedure  called  is  selected 
from  among  the  procedures  that  take  a  tuple  from  the  relation  specified  by 
the  actual  argument  STUDEMP  or  any  relation  from  which  attributes  in 
the  actual  argument  are  inherited  (e.g.,  PERSON ,  EMPLOYEE,  and  STU¬ 
DENT). 

Each  relation  has  an  inheritance  precedence  list  (IPL)  that  is  used  to 
resolve  the  conflict.  The  list  is  constructed  by  starting  with  the  relation 
itself  and  doing  a  depth-first  search  up  the  inheritance  hierarchy  starting 
with  the  first  relation  specified  in  the  inherits-dause.  For  example,  the 
inherits-clause  for  STUDEMP  is 

...  inherits  (STUDENT,  EMPLOYEE) 
and  its  IPL  is 

(STUDEMP,  STUDENT,  EMPLOYEE,  PERSON) 

PERSON  appears  after  EMPLOYEE  rather  than  after  STUDENT  where  it 
would  appear  in  a  depth-first  search  because  both  STUDENT  and 
EMPLOYEE  inherit  attributes  from  PERSON  (see  figures  1  and  2).  In 
other  words,  all  but  the  last  occurrence  of  a  relation  in  the  depth-first  order¬ 
ing  of  the  hierarchy  is  deleted.* 

When  a  procedure  is  called  and  passed  a  tuple  as  the  first  argument, 
the  actual  procedure  invoked  is  the  first  definition  found  with  the  same 
name  when  the  procedures  that  take  arguments  from  the  relations  in  the 

s  We  art  using  a  rule  that  is  rinllir  to  the  rule  for  tbs  new  Com  moo  Lisp  object  mode) 
(Boe86].  It  is  actually  slightly  more  complicated  than  described  here  is  order  to  eliminate 
some  nasty  cases  that  arise  whan  there  ore  cycles  in  the  inheritance  hierarchy. 
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ILP  of  the  argument  are  searched  in  order.  In  the  example  above,  the  Comp 
procedure  defined  for  STUDENT  is  called  because  there  is  no  procedure 
named  Comp  defined  for  STUDEMP  and  STUDENT  is  the  next  relation  in 

the  IPL.  , 

The  implementation  of  this  procedure  selection  rule  is  relatively  easy. 
Assume  that  two  system  catalogs  are  defined: 

PROCDEF(ProcName,  ArgName,  Procld) 

IPURelationName,  IPLEntry,  SeqNo) 

where  PROCDEF  has  an  entry  for  each  procedure  defined  and  IPL  main* 
tains  the  precedence  lists  for  all  relations.  The  attributes  in  PROCDEF 
represent  the  procedure  name,  the  argument  type  name,  and  the  unique 
identifier  for  the  procedure  code  stored  in  another  catalog.  The  attributes  in 
IPL  represent  the  relation,  an  IPL  entry  for  the  relation,  and  the  sequence 
number  for  that  entry  in  the  IPL  of  the  relation.  With  these  two  catalogs, 
the  query  to  find  the  correct  procedure  for  the  call 

Comp(STUDEMP) 

is4 

retrieve  (PProdd) 
from  P  in  PROCDEF,  I  in  IPL 
where  P.ProcName  *  "Comp" 
and  I.RelationName  *=  "STUDEMP" 
and  I-IPLEntry  =  P ArgName 

and  I.SeqNo  *  MINd.SeqNo  by  I.RelationName,  PArgName 
where  LIPLEntry  =  PArgName) 

This  query  can  be  precomputed  to  speed  up  procedure  selection. 

In  summary,  the  major  changes  required  to  support  procedure  inheri¬ 
tance  is  1)  allow  tuples  as  arguments  to  procedures,  2)  define  a  representa¬ 
tion  for  variable  argument  lists,  and  3)  implement  a  procedure  selection 
mechanism.  This  extension  to  the  relational  model  is  relatively  straightfor¬ 
ward  and  only  requires  a  small  number  of  changes  to  the  DBMS  implemen¬ 
tation. 


*  TkU  query  mm  •  QUEL+tyl*  aggregate  function 
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5.  Comparison  to  Other  Data  Models 

Dus  section  compares  the  POSTGRES  data  model  to  semantic,  func¬ 
tional,  and  object-oriented  data  models. 

Semantic  and  functional  data  models 

[Dae85,HaM81,Mye80,Shi81,SmS77,Zan3.]  do  not  provide  the  flexibility 
provided  by  the  model  described  here.  They  cannot  easily  represent  data 
with  uncertain  structure  (e.g.,  objects  with  shared  subobjects  that  have 
different  types). 

Modeling  ideas  oriented  toward  complex  objects  [HaL82,LoP83]  cannot 
deal  with  objects  that  have  a  variety  of  shared  subobjects.  POSTGRES  uses 
procedures  to  represent  shared  subobjects  which  does  not  have  limitation  on 
the  types  of  subobjects  that  are  shared.  Moreover,  the  nested-dot  notation 
allows  convenient  access  to  selected  subobjects,  a  feature  not  present  in 
these  systems. 

Several  proposals  have  been  made  to  support  data  models  that  contain 
non-first  normal  form  relations  [Bae86,Dae86,Sch86].  The  POSTGRES  data 
model  can  be  used  to  support  non-first  normal  form  relations  with 
procedure-types.  Consequently,  POSTGRES  seems  to  contain  a  superset  of 
the  capabilities  of  these  proposals. 

Object-oriented  data  models  [CoM84,Mai86]  have  modeling  constructs 
to  deal  with  uncertain  structure.  For  example,  GemStone  supports  union 
types  which  can  be  used  to  represent  subobjects  that  have  different  types 
[CoM84].  Sharing  of  subobjects  is  represented  by  storing  the  subobjects  as 
separate  records  and  connecting  them  to  a  parent  object  with  pointer-chains. 
Precomputed  procedure  values  will,  in  our  opinion,  make  POSTGRES  per¬ 
formance  competitive  with  pointer-chain  proposals.  The  preformance  prob¬ 
lem  with  pointer-chains  will  be  most  obvious  when  an  object  is  composed  of 
a  large  number  of  subobjects.  POSTGRES  will  avoid  this  problem  because 
the  pointer-chain  is  represented  as  a  relation  and  the  system  can  use  all  of 
the  query  processing  and  storage  structure  techniques  available  in  the  sys¬ 
tem  to  represent  it  Consequently,  POSTGRES  uses  a  different  approach 
that  supports  the  same  modeling  capabilities  and  an  implementation  that 
may  have  better  performance. 

Finally,  the  POSTGRES  data  model  could  claim  to  be  object-oriented, 
though  we  prefer  not  to  use  this  word  because  few  people  agree  on  exactly 
what  it  means.  The  data  model  provides  the  same  capabilities  as  an  object- 
oriented  model,  but  it  does  so  without  discarding  the  relational  model  and 
without  having  to  introduce  a  new  confusing  terminology. 


S3 


6.  Summary 

The  POSTGRES  data  model  usee  the  idea*  of  abstract  data  types,  data 
of  type  procedure,  and  inheritance  to  extend  the  relational  model  These 
Ideas  can  ^e  used  to  simulate  a  variety  of  semantic  data  modeling  concepts 
(04.,  aggregation  and  generalization).  In  addition,  the  same  ideas  can  be 
used  to  support  complex  objects  that  have  unpredictable  composition  and 
shared  subobjects. 
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